Reorganize Space::acquire #1381

wks · 2025-09-01T09:57:03Z

We rewrite the function into a more linear style and remove duplicated code of handling blocking for GC.

Although we do not intend to change the logic of Space::acquire in this PR, we still fix some obvious bugs in the previous Err(_) => { ... } code path.

Previously the on_pending_allocation after failing to get new pages from PageResource did not include side metadata. Now it always includes side metadata.

We rewrite the function into a more linear style and remove duplicated code of handling blocking for GC. Although this PR does not intend to change the logic of Space::acquire, it does fix some obvious bugs. - Previously if the current thread is a mutator, or GC is not enabled, the thread will attempt to poll and block for GC if it fails to get new pages from the PageResource. Now it panicks. - Previously the `on_pending_allocation` after failing to get new pages from PageResource did not include side metadata. Now it always includes side metadata.

wks · 2025-10-09T07:45:49Z

This PR is now ready for review. This PR is a preparatory step for #1382 which we recently proposed. This PR does not change the semantics of Space::acquire (except for obvious bugs), but should make the following decisions easier to handle, in terms of how they affect the control flow: (1) whether we can block for GC (depending on whether we are at safepoint), (2) whether we can over-commit, and (3) whether we are allowed to poll.

wks · 2025-10-09T09:41:28Z

The test case mock_test_allocate_without_initialize_collection failed and entered an infinite loop. When GC is not initialized, it should panic, but somehow failed to panic. I'll try to make fewer changes.

qinsoon · 2025-10-09T23:09:26Z

I agree that removing duplicate code is a good idea, but I’m not convinced that refactoring acquire into a more 'linear' style improves readability.

The original logic was structured roughly like this:

if should_poll && poll {
	do gc
} else {
	allocate
	if success {
		mmap/zero/assert
	} else {
		poll
		do gc
	}
}

After refactoring, it becomes:

if should_poll {
	poll
}
if should_allocate {
	allocate
	if success {
		mmap/zero/assert
		return
	}
}
if failed_allocation {
	poll
}
do gc

In the original impl, the cascading if/else structure made the control flow and conditions explicit -- it’s immediately clear which path you’re in and why. The new 'linear' version spreads out the conditions, so when reading later parts of the function, you need to track all preceding conditions and be aware of early returns. This makes it harder to reason about the logic.

The 'do gc' part might be the only significant duplicated code here. It is reasonable to just extract it into a closure and reuse.

wks · 2025-10-10T02:06:45Z

The advantage of the new style is that the condition for polling, allocating, and doing GC are independent. It matches the new API proposed in #1400 (i.e. using boolean options) better. For example, the current control flow is

if should_poll && poll() {
    schedule and block for GC
} else {
    allocate
    if failed {
        poll
        schedule and block for GC
    }
}

If we allow both polling and over-committing, the control flow will become

if should_poll && poll() && !allow_overcommit {
    schedule and block for GC // reach here when disallowing overcommit
} else {
    allocate
    if failed {
        poll
        schedule and block for GC // reach here when allowing overcommit
    }
}

First of all, the !allow_overcommit part in the if statement is confusing. should_poll && poll() && !allow_overcommit does two things:

Conditionally call poll(). It should call poll() if should_poll is true.
Decide the condition of whether to attempt allocation, or go to the code path of "scheduling and blocking for GC". It should try allocating if GC is not triggered or over-committing is allowed.

I think it is bad to mix control flow (to call poll(), or not to call poll()) and condition computing (whether to allocate) in the condition part of the if statement. That's why I lifted polling into its own if statement: if should_poll { gc_triggerd = poll() }, and compute the condition of whether to attempt allocation separately. As I mentioned in the comment, we can change let should_try_to_allocate = !gc_triggered; to let should_try_to_allocate = !gc_triggered || allow_overcommit; later.

Secondly, the duplicated code blocks for schedule and block for GC becomes more confusing if we add over-committing into the play. Now if allow_overcommit is true, it will enter the second schedule and block for GC instead of the first. It is confusing because both code blocks handle the case that we can't allocate (whether we attempted allocation or not).

The new style emphasizes that the main thing Space::acquire does is allocating. That's why if should_allocate { allocate } is the centerpiece of this function. One advantage of this is that we guarantee the self.common().acquire_lock is always released after the if should_allocate { allocate } block.

Speaking of early returns, there are only two early returns in this function.

One is when the allocation is successful. I think this is a proper use of early return, because the rest part of this function handles allocation failure.
The other early return is when alloc_options.on_fail.allow_gc() is false. I am OK with converting this early return into if ... { return 0; } else { ... } return 0; or simply if allow_gc() { ...} return 0; But note that Use boolean allocation options #1400 adds an at_safepoint option, and it will be used as part of this condition, too.

I think if failed_allocation { poll } expresses its intent. But since #1400 also adds an allow_polling option, the condition will be different after that. And if we are not allowed to poll, we should not block for GC, either.

wks · 2025-10-10T02:11:56Z

Oops I realized that the code in the error-handling trailer. VM::VMCollection::block_for_gc should only be called if gc_triggered is true. But we haven't thought through the case where allow_polling == false && at_safepoint == true. Maybe allow_polling is not the right option in the first place, and it should be eager_polling which only controls the first poll but not the second poll (which is a force poll and must succeed).

wks mentioned this pull request Oct 9, 2025

Change the semantics of OverCommit to allow polling #1382

Draft

wks added 2 commits October 9, 2025 15:25

Merge branch 'master' into fix/reorganize-space-acquire

f9328fa

Merge branch 'master' into fix/reorganize-space-acquire

a52d5ff

wks marked this pull request as ready for review October 9, 2025 07:40

wks added the PR-extended-testing Run extended tests for the pull request label Oct 9, 2025

wks requested a review from qinsoon October 9, 2025 07:41

wks added 2 commits October 9, 2025 17:41

Make fewer changes

4219f15

Add another log, just to be consistent

06680f1

wks mentioned this pull request Oct 9, 2025

Use boolean allocation options #1400

Draft

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reorganize Space::acquire #1381

Reorganize Space::acquire #1381

wks commented Sep 1, 2025 •

edited

Loading

Uh oh!

wks commented Oct 9, 2025

Uh oh!

wks commented Oct 9, 2025

Uh oh!

qinsoon commented Oct 9, 2025

Uh oh!

wks commented Oct 10, 2025

Uh oh!

wks commented Oct 10, 2025

Uh oh!

Uh oh!

Reorganize Space::acquire #1381

Are you sure you want to change the base?

Reorganize Space::acquire #1381

Conversation

wks commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wks commented Oct 9, 2025

Uh oh!

wks commented Oct 9, 2025

Uh oh!

qinsoon commented Oct 9, 2025

Uh oh!

wks commented Oct 10, 2025

Uh oh!

wks commented Oct 10, 2025

Uh oh!

Uh oh!

wks commented Sep 1, 2025 •

edited

Loading